Component-based Attention for Large-scale Trademark Retrieval
The demand for large-scale trademark retrieval (TR) systems has significantly
increased to combat the rise in international trademark infringement.
Unfortunately, the ranking accuracy of current approaches using either
hand-crafted or pre-trained deep convolutional neural network (DCNN) features is
inadequate for large-scale deployments. We show in this paper that the ranking
accuracy of TR systems can be significantly improved by incorporating hard and
soft attention mechanisms, which direct attention to critical information such
as figurative elements and reduce attention given to distracting and
uninformative elements such as text and background. Our proposed approach
achieves state-of-the-art results on a challenging large-scale trademark
dataset. Comment: Fix typos related to authors' information.
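The soft-attention idea described above can be illustrated with a minimal sketch: score each spatial location of a DCNN feature map, softmax over positions, and pool a single descriptor. The shapes and the linear scoring below are assumptions for illustration, not the paper's exact design.

```python
import numpy as np

def soft_attention_pool(feature_map, w):
    # Spatial soft attention: score each location, softmax over
    # positions, then take the weighted sum as the image descriptor.
    scores = feature_map @ w                    # (H, W) relevance per location
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # spatial softmax, sums to 1
    return (feature_map * weights[..., None]).sum(axis=(0, 1))

fmap = np.random.rand(7, 7, 512)                # e.g. final conv features
w = np.random.rand(512)                         # attention projection vector
desc = soft_attention_pool(fmap, w)             # one 512-d image descriptor
```

Locations with higher scores (e.g. figurative elements) dominate the pooled descriptor, while low-scoring regions (text, background) are down-weighted.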
MTRNet: A Generic Scene Text Eraser
Text removal algorithms have been proposed for uni-lingual scripts with
regular shapes and layouts. However, to the best of our knowledge, a generic
text removal method which is able to remove all or user-specified text regions
regardless of font, script, language or shape is not available. Developing such
a generic text eraser for real scenes is a challenging task, since it inherits
all the challenges of multi-lingual and curved text detection and inpainting.
To fill this gap, we propose a mask-based text removal network (MTRNet). MTRNet
is a conditional adversarial generative network (cGAN) with an auxiliary mask.
The introduced auxiliary mask not only makes the cGAN a generic text eraser,
but also enables stable training and early convergence on a challenging
large-scale synthetic dataset, initially proposed for text detection in real
scenes. Moreover, MTRNet achieves state-of-the-art results on several
real-world datasets, including ICDAR 2013, ICDAR 2017 MLT, and CTW1500, without
being explicitly trained on them, outperforming previous state-of-the-art
methods trained directly on these datasets. Comment: Presented at the ICDAR 2019 conference.
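Conditioning a cGAN generator on an auxiliary mask is typically done by stacking the mask as an extra input channel. A minimal sketch of that input construction (MTRNet's generator itself is a trained network, not shown here):

```python
import numpy as np

def generator_input(image, mask):
    # Concatenate the auxiliary text mask as a fourth channel of the
    # generator input, the usual way a cGAN is conditioned on a mask.
    assert image.shape[:2] == mask.shape
    return np.concatenate([image, mask[..., None]], axis=-1)

img = np.zeros((256, 256, 3))      # RGB scene image
mask = np.ones((256, 256))         # 1 marks text pixels to erase
x = generator_input(img, mask)     # (256, 256, 4) conditioned input
```

Setting the mask to all ones asks for all text to be removed; a user-drawn mask restricts removal to specified regions, which is what makes the eraser generic.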
Learning Test-time Data Augmentation for Image Retrieval with Reinforcement Learning
Off-the-shelf convolutional neural network features achieve outstanding
results in many image retrieval tasks. However, their invariance is pre-defined
by the network architecture and training data. Existing image retrieval
approaches require fine-tuning or modification of the pre-trained networks to
adapt to the variations in the target data. In contrast, our method enhances
the invariance of off-the-shelf features by aggregating features extracted from
images augmented with learned test-time augmentations. The optimal ensemble of
test-time augmentations is learned automatically through reinforcement
learning. Our training is time- and resource-efficient and learns a diverse set
of test-time augmentations. Experimental results on trademark retrieval (METU
trademark dataset) and landmark retrieval (Oxford5k and Paris6k scene datasets)
tasks show that the learned ensemble of transformations is effective and
transferable. We also achieve state-of-the-art MAP@100 results on the METU
trademark dataset.
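The aggregation step can be sketched as follows: apply each transform in the learned ensemble, extract features from every augmented copy, and pool them into one descriptor. Mean pooling and the toy transforms/extractor below are stand-ins for illustration; the paper learns which transforms to ensemble via reinforcement learning.

```python
import numpy as np

def aggregate_tta_features(image, transforms, extract):
    # Extract features from each augmented copy of the image and
    # average them, then L2-normalise for cosine-similarity retrieval.
    feats = [extract(t(image)) for t in transforms]
    f = np.mean(feats, axis=0)
    return f / np.linalg.norm(f)

# toy stand-ins for a learned augmentation set and a CNN extractor
transforms = [lambda x: x, lambda x: np.fliplr(x), lambda x: np.rot90(x)]
extract = lambda x: x.mean(axis=(0, 1))       # per-channel mean as "features"
img = np.random.rand(32, 32, 3)
desc = aggregate_tta_features(img, transforms, extract)
```

Because only the input images are transformed, the pre-trained network stays frozen, which is why no fine-tuning is needed.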
Towards Self-Explainability of Deep Neural Networks with Heatmap Captioning and Large-Language Models
Heatmaps are widely used to interpret deep neural networks, particularly for
computer vision tasks, and the heatmap-based explainable AI (XAI) techniques
are a well-researched topic. However, most studies concentrate on enhancing the
quality of the generated heatmap or discovering alternate heatmap generation
techniques, and little effort has been devoted to making heatmap-based XAI
automatic, interactive, scalable, and accessible. To address this gap, we
propose a framework that includes two modules: (1) context modelling and (2)
reasoning. We propose a template-based image captioning approach for context
modelling to create text-based contextual information from the heatmap and
input data. The reasoning module leverages a large language model to provide
explanations in combination with specialised knowledge. Our qualitative
experiments demonstrate the effectiveness of our framework and heatmap
captioning approach. The code for the proposed template-based heatmap
captioning approach will be publicly available.
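Template-based captioning of a heatmap can be reduced to a very small sketch: rank the regions by attribution mass and fill a fixed sentence template. The region names and template below are invented for illustration; the framework's real templates and region extraction are not specified in this abstract.

```python
def heatmap_caption(region_scores, template):
    # Pick the region carrying the most attribution and verbalise it.
    top = max(region_scores, key=region_scores.get)
    return template.format(region=top, score=region_scores[top])

scores = {"beak": 0.61, "wing": 0.27, "background": 0.12}
caption = heatmap_caption(
    scores,
    "The model's prediction relies mostly on the {region} "
    "(attribution {score:.0%}).")
```

The resulting text is what the reasoning module would pass, together with specialised knowledge, to the large language model.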
A facile solid-state heating method for preparation of poly(3,4-ethylenedioxythiophene)/ZnO nanocomposite and photocatalytic activity
Poly(3,4-ethylenedioxythiophene)/zinc oxide (PEDOT/ZnO) nanocomposites were prepared by a simple solid-state heating method, in which the content of ZnO was varied from 10 to 20 wt%. The structure and morphology of the composites were characterized by Fourier transform infrared (FTIR) spectroscopy, ultraviolet-visible (UV-vis) absorption spectroscopy, X-ray diffraction (XRD), and transmission electron microscopy (TEM). The photocatalytic activities of the composites were investigated by the degradation of methylene blue (MB) dye in aqueous medium under UV light and natural sunlight irradiation. The FTIR, UV-vis, and XRD results showed that the composites were successfully synthesized, and there was a strong interaction between PEDOT and nano-ZnO. The TEM results suggested that the composites were a mixture of shale-like PEDOT and less aggregated nano-ZnO. The photocatalytic activity results indicated that the incorporation of ZnO nanoparticles can enhance the photocatalytic efficiency of the composites under both UV light and natural sunlight irradiation, and the highest photocatalytic efficiency under UV light (98.7%) and natural sunlight (96.6%) after 5 h occurred in the PEDOT/15 wt% ZnO nanocomposite.
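Efficiency figures like those quoted above are conventionally computed from the dye concentration before and after irradiation, eta = (C0 - C) / C0 x 100. The concentrations below are invented for illustration only; the abstract does not report them.

```python
def degradation_efficiency(c0, c):
    # Standard photocatalytic degradation efficiency (%) from the
    # initial and final dye concentrations.
    return (c0 - c) / c0 * 100.0

# hypothetical MB concentrations (mg/L) before and after 5 h under UV
eta = degradation_efficiency(10.0, 0.13)
```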
MTRNet++: One-stage Mask-based Scene Text Eraser
A precise, controllable, interpretable and easily trainable text removal
approach is necessary for both user-specific and large-scale text removal
applications. To achieve this, we propose a one-stage mask-based text
inpainting network, MTRNet++. It has a novel architecture that includes
mask-refine, coarse-inpainting and fine-inpainting branches, and attention
blocks. With this architecture, MTRNet++ can remove text either with or without
an external mask. It achieves state-of-the-art results on both the Oxford and
SCUT datasets without using external ground-truth masks. The results of
ablation studies demonstrate that the proposed multi-branch architecture with
attention blocks is effective and essential. It also demonstrates
controllability and interpretability. Comment: This paper is under CVIU review (after major revision).
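The three-branch flow described above can be sketched schematically. The branch bodies below are toy stand-ins (the real branches are trained subnetworks); only the wiring, including the optional external mask, follows the abstract.

```python
import numpy as np

def mask_refine(image, rough_mask):
    # stand-in for the learned mask-refine branch
    return np.clip(rough_mask, 0.0, 1.0)

def coarse_inpaint(image, mask):
    # toy coarse fill: replace masked pixels with the image mean colour
    fill = image.mean(axis=(0, 1), keepdims=True)
    return image * (1 - mask[..., None]) + fill * mask[..., None]

def fine_inpaint(coarse):
    # stand-in for the learned fine-inpainting branch
    return coarse

def mtrnetpp_forward(image, mask=None):
    # With no external mask, start from an all-ones guess; this is
    # what lets the model run either with or without user input.
    if mask is None:
        mask = np.ones(image.shape[:2])
    m = mask_refine(image, mask)
    return fine_inpaint(coarse_inpaint(image, m))

img = np.random.rand(8, 8, 3)
out = mtrnetpp_forward(img)        # no external mask supplied
```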
Missing ingredients in optimising large-scale image retrieval with deep features
This thesis applies advanced image processing and deep machine learning techniques to solve the challenges of large-scale image retrieval. Solutions are provided to overcome key obstacles in real-world large-scale image retrieval applications by introducing unique methods for making deep learning systems more reliable and efficient. The outcome of the research is useful for several image retrieval applications including patent search, and trademark and logo infringement analysis.
METU dataset: A big dataset for benchmarking trademark retrieval
Trademark retrieval (TR) is the problem of retrieving similar trademarks (logos) for a query, and the main aim is to detect copyright infringements in trademarks. Since there are millions of companies worldwide, automatically retrieving similar trademarks has become an important problem, and currently, checking trademark infringements is mostly performed manually by humans. However, although there have been many attempts at automated TR, as also acknowledged in the community, the problem is largely unsolved. One of the main reasons for this is the unavailability of a publicly available comprehensive dataset that includes the various challenges of the TR problem. In this article, we propose and introduce a large dataset composed of more than 930,000 trademarks, and evaluate the existing approaches in the literature on this dataset. We show that the existing methods are far from being useful on such a challenging dataset, and we hope that the dataset can facilitate the development of better methods to make progress in the performance of trademark retrieval systems.
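Benchmarking retrieval methods on a dataset like this usually comes down to a rank-based score per query. A common choice for TR evaluation is the normalised average rank, sketched below; whether it is this paper's exact metric is an assumption here.

```python
import numpy as np

def normalized_average_rank(ranks, n_relevant, n_total):
    # Normalised average rank of the relevant items for one query:
    # 0 is a perfect ranking, about 0.5 is a random one.
    ranks = np.asarray(ranks, dtype=float)   # 1-based ranks of relevant items
    return (ranks.sum() / n_relevant - (n_relevant + 1) / 2) / n_total

# a query whose 3 relevant trademarks rank 1, 2, 3 out of 1000 scores 0
nar = normalized_average_rank([1, 2, 3], 3, 1000)
```

Averaging this score over all benchmark queries gives a single number with which different TR approaches can be compared.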
Noisy Uyghur Text Normalization
Uyghur is the second largest and most actively used social media language in China. However, a non-negligible part of Uyghur text appearing in social media is unsystematically written with the Latin alphabet, and it continues to increase in size. Uyghur text in this format is incomprehensible and ambiguous even to native Uyghur speakers. In addition, Uyghur texts in this form cannot benefit from advances in NLP tasks for the Uyghur language. Normalizing noisy Uyghur text written in unsystematic Latin script, and preventing its spread, is essential to protecting the Uyghur language and improving the accuracy of Uyghur NLP tasks. To this purpose, in this work we propose and compare the noisy channel model and the neural encoder-decoder model as normalizing methods.
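The noisy channel model mentioned above decodes a noisy word n by choosing the clean candidate w that maximises P(w) * P(n | w). A minimal sketch with invented probabilities; a real system estimates the language model from a clean corpus and the channel model from aligned noisy/clean pairs.

```python
def normalize_noisy(noisy, candidates, lm_prob, channel_prob):
    # argmax over candidates of language-model prob times channel prob
    return max(candidates,
               key=lambda w: lm_prob[w] * channel_prob.get((noisy, w), 0.0))

# hypothetical toy distributions for a single noisy token
lm_prob = {"kitab": 0.7, "kitap": 0.3}
channel_prob = {("kitab", "kitab"): 0.6, ("kitab", "kitap"): 0.4}
best = normalize_noisy("kitab", ["kitab", "kitap"], lm_prob, channel_prob)
```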
A Large-scale Dataset and Benchmark for Similar Trademark Retrieval
Trademark retrieval (TR) has become an important yet challenging problem due
to an ever-increasing trend in trademark applications and infringement
incidents. There have been many promising attempts at the TR problem, which,
however, proved impractical since they were evaluated on limited and mostly
trivial datasets. In this paper, we provide a large-scale dataset with
benchmark queries with which different TR approaches can be evaluated
systematically. Moreover, we provide a baseline on this benchmark using the
widely used methods applied to TR in the literature. Furthermore, we identify
and correct two important issues in TR approaches that were not addressed
before: reversal of contrast and the presence of irrelevant text in trademarks,
both of which severely affect TR methods. Lastly, we apply deep learning,
namely several popular convolutional neural network models, to the TR problem.
To the best of the authors' knowledge, this is the first attempt to do so.
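One simple way to handle the contrast-reversal issue is to detect light-on-dark trademarks and invert them so every image shares a dark-on-light convention before feature extraction. This is a sketch under that assumption; the paper's actual correction may differ.

```python
import numpy as np

def normalize_contrast(gray):
    # gray: image with intensities in [0, 1]; invert if the image is
    # mostly dark, i.e. likely a light logo on a dark background.
    if gray.mean() < 0.5:
        return 1.0 - gray
    return gray

dark_bg = np.full((4, 4), 0.1)     # toy light-on-dark trademark
fixed = normalize_contrast(dark_bg)
```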